1  INTRODUCTION

This Memorandum describes research on the use of automatic speech recognition
in the cockpit of an aircraft. Such applications are commonly referred to as
Direct Voice Input, or simply DVI. The research used a BAC 1-11 aircraft and
was concerned with the civil flight deck. The Memorandum aims to draw out the
lessons learned and relate them to military operations. Trials currently
underway using a Tornado GR1 are also described. This military programme is
planned to build directly on the experience gained using the BAC 1-11.

2  BACKGROUND 

In late 1981, the RAE embarked upon a flight research programme to
investigate the potential benefits of using DVI on the civil flight deck. The
trials aircraft was the BAC 1-11 shown in Fig 1. The role of this aircraft was
to provide a flying laboratory for civil avionics research. Various trials were
conducted at the time, covering such research as improved navigation techniques,
the use of electronic cockpit displays, enhanced Flight Management Systems (FMS)
and research on non-linear energy based control laws^. The advent of such new
technology has had a significant effect on the pilot's role. The days of flying
with hands permanently on the control column have become a thing of the past.

The pilot has become much more of a systems manager. Hands are now increasingly
required for button selection on a variety of onboard data entry keyboards. The
majority of these keyboards are far from ideal due to the constraints of
available space in suitable locations and the complexity of the task for which
they are required. As a result, control functions can take excessive time due to
a combination of keying errors and protracted procedures. Such operations
generally require the full visual attention of the pilot. Keeping a good
lookout is affected as a result, as is the ability to monitor other cockpit
display surfaces.

Many avenues could have been explored in improving the cockpit management
procedures, such as better designed and positioned keyboards, or the use of
joysticks, rollerballs and touch sensitive screens. All of these would have
required significant cockpit re-design; DVI would have little impact on the
cockpit real estate, although some of the systems implications would be
profound. The location of the Automatic Speech Recogniser (ASR) is unimportant.

The microphone requirements are similar to those of communications. The main
additional requirements are a switch to engage DVI and a display to present the
output of the recogniser.

DVI, untried before in a cockpit environment, appeared to offer three main
advantages, namely convenience, speed and freedom. It is obviously more
convenient and faster for humans to communicate by speaking to each other, this
method of communication allowing the hands or eyes freedom to conduct other
tasks at the same time. The problems of DVI were likely to be recognition
performance and restrictions on vocabulary imposed by the equipment. The purpose
of the flight trials onboard the BAC 1-11 was to test whether the advantages
could be realised in flight conditions, whether the advantages outweighed the
disadvantages, what recognition performance was required for operational use and
to establish guidelines for system integration.

3  THE BAC 1-11 TRIAL

The prime requirement for this trial to commence was an ASR. In late 1981
no suitable UK device existed and an American Threshold T-500 isolated word
recogniser was used to gain experience.

Initially, the main interest was to characterise recognition performance
in the airborne environment. At that time, no performance figures for
recognition during flight existed. Nor, indeed, did any standard test. Thus, a
test procedure had to be devised. It is worth remembering that recognition
results obtained from any ASR will depend on various parameters. The size of the
active vocabulary, how similar words sound in the vocabulary, and the level of
background noise, which could affect the input signal or change the way the
speaker talks, are examples (Ref 2). Recognition performance will also depend on
the user, DVI being ideal for consistent speakers but less so for others.

The initial test vocabulary consisted of the ICAO phonetic alphabet, Alpha
to Zulu, and the digits one to twenty-six. This test vocabulary was intended to
consist  of  words  which  would  be  easily  understood  by  humans  over  a  radio  link  as 
well  as  words  that  could  be  confused,  such  as  five  and  nine.  Prior  to  the  test 
commencing,  each  user  trained  the  recogniser  following  the  procedure  laid  down 
by  the  manufacturer.  This  involved  saying  each  word  ten  times.  An  average 
utterance  from  these  ten  samples  was  then  calculated  by  the  recogniser.  The 
duration  of  this  training  session  was  found  by  all  users  to  be  too  long.  There 
was  a  danger  that  the  later  samples  would  be  unrepresentative  due  to  boredom, 
resulting  in  poor  templates.  In  conclusion,  training  or  enrolment  sessions  must 
be  fairly  short;  reading  a  short  passage  of  speech  or  saying  one  sample  of  each 
word  would  be  preferable. 


UNLIMITED 


TM  FM  43 



During the test, conducted in the quiet of the laboratory and then in the
BAC 1-11 during flight, the user would say each allowable word in isolation,
following the order of Alpha, Bravo, Charlie etc. This was repeated twenty
times, giving 1040 utterances for a test. Percentage errors were then
calculated and a confusion matrix was produced to illustrate the pattern of the
errors.
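
The scoring of such a test can be sketched in a few lines of Python. This is
an illustrative sketch only: the function name `score_trial`, the
representation of a test as (spoken, recognised) pairs and the sample errors
are assumptions, not details of the software used in the trial.

```python
from collections import Counter

# The 52-word test vocabulary described in the text: the ICAO phonetic
# alphabet plus the digits one to twenty-six (written here as numerals).
ICAO = ["ALPHA", "BRAVO", "CHARLIE", "DELTA", "ECHO", "FOXTROT", "GOLF",
        "HOTEL", "INDIA", "JULIETT", "KILO", "LIMA", "MIKE", "NOVEMBER",
        "OSCAR", "PAPA", "QUEBEC", "ROMEO", "SIERRA", "TANGO", "UNIFORM",
        "VICTOR", "WHISKEY", "XRAY", "YANKEE", "ZULU"]
VOCABULARY = ICAO + [str(n) for n in range(1, 27)]
REPETITIONS = 20  # 52 words x 20 passes = 1040 utterances per test


def score_trial(pairs):
    """Return (percentage error, confusion counts) for (spoken, recognised) pairs."""
    pairs = list(pairs)
    # Only misrecognitions are counted; each key is a (spoken, recognised) pair.
    confusions = Counter((s, r) for s, r in pairs if s != r)
    errors = sum(confusions.values())
    percent_error = 100.0 * errors / len(pairs) if pairs else 0.0
    return percent_error, confusions


# Hypothetical outcome: every utterance correct, except that "5" was
# recognised as "9" on three of the twenty passes.
trial = [(w, w) for w in VOCABULARY for _ in range(REPETITIONS)]
misses = 0
for i, (spoken, _) in enumerate(trial):
    if spoken == "5" and misses < 3:
        trial[i] = ("5", "9")
        misses += 1

percent_error, confusions = score_trial(trial)
print(f"{len(trial)} utterances, {percent_error:.2f}% error")
print(confusions.most_common(1))
```

The confusion counts form a sparse confusion matrix: only the cells that
actually occurred are stored, which is convenient when most utterances are
recognised correctly.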


Many results could be presented for the various speakers undertaking these
tests. For convenience, a brief summary is given in Table 1, indicating that
the performance obtained during flight was unacceptable and much worse than in
the laboratory. The main difference between laboratory and flight tests was
that the background noise level was much greater in flight; typical noise levels
in the aircraft are given in Table 2. Although the Threshold T-500 provided a
useful benchmark for other recognisers, its performance was inadequate for use
in the cockpit. This preliminary trial also showed that isolated word
recognition was completely unsuitable for inputting strings of data such as
radio frequencies and latitudes and longitudes. The aircrew found isolated
recognition to be slow, unnatural, irritating and distracting.

Before  integration  with  the  avionics  could  take  place,  the  ASR  required 
background  noise  compensation,  and  a  connected  word  capability.  The  Marconi 
SR  128  was  developed  to  meet  these  requirements  and  is  illustrated  in  Fig  2. 
Although  the  SR  128  required  the  user  to  update  the  background  noise  mask 
manually,  the  very  fact  that  some  form  of  compensation  had  been  implemented  was 
a  move  in  the  right  direction.  Additionally,  a  larger  vocabulary  of  240  words 
or  phrases  was  provided  and  the  training  routine  only  required  one  sample  per 
utterance.  A  programmable  syntax  enabled  the  vocabulary  to  be  structured  such 
that  only  relevant  words  were  considered  in  the  recognition  processing;  this  not 
only  maximised  the  recognition  performance,  it  also  reduced  the  time  taken  for 
recognition.  The  SR  128  was  delivered  to  RAE  in  early  1982. 

The same tests as previously described were conducted. A summary of the
results is shown in Table 3. The results were very encouraging, certainly
adequate to attempt some system integration. The programme of system
integration was defined in three phases:

Phase  1  -  Integrate  DVI  with  the  onboard  electronic  displays. 

Phase 2 - If the pilot opinion was favourable towards DVI, integrate it
with a Radio and navigation Management System (RMS).

Phase 3 - Control the route change procedures and other flight conditions,
such as speed and height, by DVI through the FMS.

4  BAC 1-11 TRIAL PHASE 1

The port side of the BAC 1-11 cockpit had been converted to electronic
colour displays, as shown in Fig 3. Each display could show not only the
primary information but also a simple electronic map representing the aircraft
position in plan or elevation along its pre-programmed route. Prior to using
DVI, the pilot's control of this map was restricted to switches on the coaming
and centre pedestal. The map was generated by a programme residing in a General
Purpose digital Computer (GPC). Although the task of changing formats or
selecting information was not causing any significant workload, this initial
phase could answer the question, "Do you like using DVI?" Also, the act of
coupling the SR 128 to the GPC enabled some of the basic issues of integrating
DVI into the cockpit to be addressed.

On  recognition  of  a  word  or  phrase,  the  SR  128  would  output  the  code  for 
the  word  or  phrase  chosen,  the  text  of  that  word  or  phrase  and  the  score  of  the 
match  with  the  corresponding  training  template.  Two  methods  of  integration  were 
therefore  considered: 

(1)  Modify  the  programme  in  the  GPC  to  take  account  of  the  serial  data  sent  by 
the  ASR  or 

(2) introduce some downstream processing between the ASR and the GPC. This
in-house microprocessor based system would emulate the discrete signals
which defined the configuration of the electronic map, normally sent to
the GPC directly. The programme within the GPC could therefore remain
unchanged.

As further integration with other avionics was considered, method (2) was
chosen because it offered greater flexibility. The general layout of the DVI
system onboard the BAC 1-11 is shown in Fig 4. This working system was
demonstrated at the SBAC Air Show during September 1982.
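
The idea behind method (2) can be illustrated with a short Python sketch, in
which a downstream processor turns each recogniser output into the discrete
settings that the display computer already understands. Everything specific
here is an assumption made for illustration: the signal names, the initial
display state, the score threshold and the convention that a higher score
means a better match are not taken from the SR 128 or the GPC.

```python
from typing import NamedTuple, Optional


class AsrOutput(NamedTuple):
    """One recognition result: word code, text and template match score."""
    code: int
    text: str
    score: float


# Hypothetical discrete signals for a few of the electronic map commands.
DISCRETES = {
    "NORTH-UP": {"orientation": "north_up"},
    "NAVAIDS": {"show_navaids": True},
    "THREE-HUNDRED": {"scale_nm": 300},
}

MIN_SCORE = 0.5  # assumed acceptance threshold (higher score = better match)


def to_discretes(result: AsrOutput, state: dict) -> Optional[dict]:
    """Return the updated discrete state, or None if the result is rejected."""
    change = DISCRETES.get(result.text)
    if change is None or result.score < MIN_SCORE:
        return None  # unknown phrase or poor match: leave the map unchanged
    return {**state, **change}


# Hypothetical initial display configuration.
state = {"orientation": "track_up", "show_navaids": False, "scale_nm": 100}
state = to_discretes(AsrOutput(12, "THREE-HUNDRED", 0.9), state) or state
print(state)
```

Because the translation lives outside the display computer, supporting a new
avionic system means adding a new table rather than changing the programme in
the GPC, which is the flexibility argument made above.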

The vocabulary required for control of the Electronic Map was 36 words,
well within the 240 word capability of the SR 128. A few of these words and the
resulting actions are indicated below:

"NORTH-UP" - selects map to North-up orientation.

"NAVAIDS" - displays the navaids in the coverage of the selected map scale.

"THREE-HUNDRED" - selects the 300 nautical mile scale.

"LOOK-AHEAD" - looks ahead to the end of the pre-programmed flight plan.

"STEP-BACK" - steps back progressively from the destination airfield until
"STOP" is said.

Isolated commands were generally used in this phase and the effect of
connected speech recognition was not explored. However, a number of important
lessons were learnt from this initial use of DVI. These are listed below.

(1) Some form of recognition feedback is required by the user. Originally the
only feedback available to the pilot was the state of the electronic
display. If a command was given and the display did not change, the pilot
had no idea if the problem was in the speech recognition or in the downstream
processing. So that he could resolve this, a small LCD display showing
the ASR output was placed in front of the pilot on the coaming. This
feedback could be even more essential for flight critical tasks, such as
weapons management in the military aircraft. Audio feedback was considered
but, in response to pilot opinion, was never installed. It was felt that
there was already a surfeit of audio activity on the flight deck without
adding to the problem. This situation differs from the military cockpit
where there is a need for head-out/hands busy operation. This need, coupled
to radio silence or minimised radio communication, makes audio feedback a
potentially valuable option. One merit of a visual display over audio
feedback is that the message remains available over a period of time and
the pilot can decide when to read it.

(2) The use of a press and hold switch was found to be best for engaging the
speech recogniser since the pilots were already familiar with this concept.
The use of keywords, to alert the ASR that a DVI command was about to be
given, was disliked. The use of a keyword takes longer than using a switch
and requires the pilot to remember to use a further keyword to disengage
the system.

(3) Commands should be concise but, within this constraint, should correspond
as closely as possible to the natural language of the cockpit. For
example, it is undesirable to issue the following instruction via the ASR:

"WOULD YOU MIND TUNING VHF RADIO THREE TO ONE THREE OH POINT SEVEN".

Much better would be:

"BOX THREE ONE THREE ZERO DECIMAL SEVEN ENTER".

Occasionally, words that are in normal aircrew usage might cause
recognition problems, such as "FIX" and "SIX" which are mutually confusable. The
use of syntax can sometimes resolve such confusions. For example, the word "FIX"
might not be required at the same branch in the syntax as the digits. As a last
resort, a word might require changing. For example, change the word "FIX" to
"PLOT". The use of familiar vocabulary will reduce the memory load on the
pilot.

The recognition results for two of the pilots using DVI to control the
electronic map are shown in Table 4. The overall feeling was that DVI provided
a useful additional method of controlling a system. All pilots were in favour
of extending the use of DVI to the RMS.

5  BAC 1-11 TRIAL PHASE 2

During the early part of 1983, an experimental RMS was installed into the
cockpit of the BAC 1-11. The two principal aims of this device were (a) to
reduce the space required by the conventional controllers and (b) to ease the
tuning of radio and navigation aids. The interface between the pilot and the
radio/navaid fit was the Integrated Control and Display Unit (ICDU) shown in
Fig 5. By use of the CRT and its associated keyboard, the user could control
the state of 10 transmitter/receivers, as well as selecting the various
check-lists during the flight.

The unit saves space in the already crowded cockpit but all pilots
commented that the ICDU was more difficult to use than the dedicated controllers.
The pilot now had to remember the correct button sequence to select the required
page, before data entry commenced. DVI was used to replace long button sequences
with explicit instructions.

At the rear of the ICDU there was a remote keyboard port available. By a
process of 'pin-shorting', each key push could be emulated. (At a later date, a
serial interface was added to the ICDU allowing the ASR to interface with the
ICDU directly). The downstream processing between the ASR and the ICDU was
programmed to emulate these 'pin-shorts' when a word was received from the ASR.
The pilot now simply asked for the correct page and this was found directly. A
typical command would be "BOX 3 130 DECIMAL 7 ENTER"; regardless of the current
page, the VHF radio page was selected directly. The frequency line was
automatically chosen, and the frequency of 130.7 MHz was inserted. If recognition
was correct, the pilot would then activate this frequency by saying "ENTER".

Various points arise from this example. The pilots used the word "BOX"
instead of "RADIO", the former being an example of conventional usage on the
flight deck. The pilots preferred to say the digits as in normal conversation,
FIVE not FIFE. The word "ENTER" acted as the executive word, indicating that
the frequency string was complete and a retune was required. Another method of
carrying this out could have been to use the state of the DVI activation switch;
on release of the switch, it could be assumed that the command was complete and
tuning was required. Generally, this assumption would be correct but a degree
of caution is always required. Pilots commented that having to say "ENTER" was
acceptable and gave them the confidence of being in control.
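
The role of the executive word can be illustrated with a minimal Python
sketch that turns a recognised word string into a radio number and frequency,
and only marks the command for execution when "ENTER" closes it. The function
name `parse_box_command` and the returned tuple are hypothetical; the real
downstream processing emulated key pushes on the ICDU rather than parsing text.

```python
DIGITS = {"ZERO": "0", "ONE": "1", "TWO": "2", "THREE": "3", "FOUR": "4",
          "FIVE": "5", "SIX": "6", "SEVEN": "7", "EIGHT": "8", "NINE": "9"}


def parse_box_command(words):
    """Parse 'BOX <n> <digits> DECIMAL <digits> [ENTER]'.

    Returns (radio_number, frequency_mhz, execute). `execute` is True only
    when the executive word ENTER closes the command.
    """
    words = list(words)
    if len(words) < 3 or words[0] != "BOX" or words[1] not in DIGITS:
        raise ValueError("expected 'BOX <digit> ...'")
    radio = int(DIGITS[words[1]])
    execute = words[-1] == "ENTER"
    body = words[2:-1] if execute else words[2:]
    text = ""
    for word in body:
        if word == "DECIMAL":
            text += "."
        elif word in DIGITS:
            text += DIGITS[word]
        else:
            raise ValueError(f"unexpected word: {word}")
    # float() rejects an empty or malformed string, e.g. two DECIMALs.
    return radio, float(text), execute


command = "BOX THREE ONE THREE ZERO DECIMAL SEVEN ENTER".split()
print(parse_box_command(command))
```

Without the final "ENTER" the same words still parse, but `execute` is False,
so the frequency would be displayed for checking and not yet tuned.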

The integration of DVI with the ICDU was seen to be highly beneficial; the
system was not only a 'space saver' but also easier to use. The number of
button selections that could be saved by using DVI is demonstrated by the
following example: "ILS PAIR 108 DECIMAL 3 ENTER". The tuning of both
Instrument Landing System (ILS) receivers can take a minimum of 16 button
selections, even with the simple keyboard of the ICDU. By saying "PAIR", both
receivers are tuned simultaneously.

These are examples of the many features available and a full description
can be found in Ref 3. One feature that is worth describing here is the tuning
of radios and navaids by name. DVI affords a method which would be
impracticable to achieve by any other means. "BOX 2 BEDFORD TOWER",
"VOR DAVENTRY" and "SQUAWK RADIO FAILURE" serve as examples of typical commands
that were used in the Phase 2 programme.

The size of vocabulary required to control the electronic colour displays
and the RMS had increased to 106 words. The recognition performance with all of
these words active was found to be unacceptable, so a syntax was programmed to
reduce the size of the active vocabularies. (Note: The vocabulary used for
the electronic colour display in Phase 1 did not use syntax). A number of
points arose from the use of syntax and are itemised below:

(1) Segmenting the vocabulary into the respective areas, in this case display
and RMS, improves recognition significantly.

(2) Pilots did not like saying the keywords, display or comms, before the main
part of the command commenced. If keywords are to be used, they must be
part of the normal command.

(3) The syntax must be as transparent as possible to the user. If the user
has to remember the exact way of saying the command, just to follow some
unnatural syntax, the convenience of DVI can soon disappear.

(4) The user should be able to exit from the syntax at any point by using a
special word. This is vital if an incorrect syntax branch is entered. In
the case of the BAC 1-11 the word used was "RESTART".

(5) Syntax can greatly aid connected digit recognition, especially when
frequency entry is considered. If "BOX 3" is said, this indicates that a VHF
frequency will follow; the first digit will therefore be ONE, and the
second digit will be ONE, TWO or THREE.
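
Points (4) and (5) can be sketched as a small finite-state syntax in Python.
This is a deliberately simplified illustration, not the SR 128 syntax: the node
names are hypothetical, every "BOX" is assumed to be a VHF radio, and only one
digit is allowed after "DECIMAL".

```python
DIGITS = ["ZERO", "ONE", "TWO", "THREE", "FOUR", "FIVE",
          "SIX", "SEVEN", "EIGHT", "NINE"]
EXIT_WORD = "RESTART"  # leaves the syntax from any node

# Each node maps to (words active at that node, next node).
SYNTAX = {
    "start": ({"BOX"}, "radio"),
    "radio": (set(DIGITS), "vhf_digit_1"),
    "vhf_digit_1": ({"ONE"}, "vhf_digit_2"),
    "vhf_digit_2": ({"ONE", "TWO", "THREE"}, "vhf_digit_3"),
    "vhf_digit_3": (set(DIGITS), "decimal"),
    "decimal": ({"DECIMAL"}, "fraction"),
    "fraction": (set(DIGITS), "end"),
    "end": ({"ENTER"}, "start"),
}


def active_words(node):
    """Words the recogniser needs to consider at this node."""
    return SYNTAX[node][0] | {EXIT_WORD}


def step(node, word):
    """Advance the syntax by one recognised word."""
    if word == EXIT_WORD:
        return "start"
    words, next_node = SYNTAX[node]
    if word not in words:
        raise ValueError(f"{word!r} is not active at node {node!r}")
    return next_node


def walk(words, node="start"):
    """Follow a whole command through the syntax and return the final node."""
    for word in words:
        node = step(node, word)
    return node


print(sorted(active_words("vhf_digit_2")))
print(walk("BOX THREE ONE THREE ZERO DECIMAL SEVEN ENTER".split()))
```

At the second frequency digit only four words are active instead of the whole
vocabulary, which is how syntax both improves recognition and shortens the
matching time.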

The  general  feeling  of  all  pilots  using  DVI  to  control  the  RMS  was 
favourable.  It  was  becoming  more  apparent  that  some  pilots  obtained  better 
recognition  than  others.  The  range  of  word  error  rates  covered  l:'J  to  4,1.  The 
pilot  who  obtained  4,1  considered  his  performance  to  be  unacceptable.  However, 
he  believed  that  DVI  was  a  technique  that  should  be  pursued,  since  DVI  would 
offer benefits to other users. (It was later shown that an ASR with some form
of Automatic Gain Control (AGC) on the speech input reduced this particular
pilot's error rate dramatically).

The choice of error correction strategies is a function of recognition
performance and the application. Methods investigated as part of the Phase 2
programme were as follows:

(1) Say a special word to indicate that an ASR or user error had taken place;
in the case of the BAC 1-11 the word "CANCEL" was used. After this, repeat
the complete digit string.

(2)  Say  a  special  word  or  phrase  to  indicate  where  the  error  had  occurred  in 
the  string.  This  might  be  the  phrase  "BACK  3"  which  would  have  a  similar 
action  as  pressing,  3  times,  a  left  cursor  control  key  on  a  conventional 
VDU.  The  digit  was  then  corrected  when  it  was  highlighted. 

(3)  Use  a  combination  of  method  (1)  and  (2). 

Each method has advantages depending on the digit string length and where
the error has occurred. However, the pilots were unanimous in their preference
for method (1). They felt that repeating whole digit strings was most typical
of human behaviour. A telephone number incorrectly heard comes to mind. This
may not be the case in military applications where longer strings and poorer
recognition may dictate the need for a segmented approach.
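
The two correction strategies can be compared with a minimal Python sketch of
a digit entry buffer. The class name `DigitEntry`, the use of numerals for
recognised digits and the treatment of "BACK n" as a single token are
assumptions made for illustration; the cursor is assumed to start just after
the last digit entered.

```python
class DigitEntry:
    """Digit string buffer supporting two voice correction strategies."""

    def __init__(self):
        self.digits = []
        self.cursor = None  # index of the highlighted digit, if any

    def hear(self, word):
        """Apply one recognised word and return the current digit string."""
        if word == "CANCEL":
            # Method (1): discard everything; the pilot repeats the string.
            self.digits = []
            self.cursor = None
        elif word.startswith("BACK "):
            # Method (2): move the highlight left, as with a cursor key.
            steps = max(1, int(word.split()[1]))
            if self.digits:
                self.cursor = max(0, len(self.digits) - steps)
        elif word.isdigit() and len(word) == 1:
            if self.cursor is None:
                self.digits.append(word)
            else:
                self.digits[self.cursor] = word  # overwrite highlighted digit
                self.cursor = None
        else:
            raise ValueError(f"unexpected word: {word}")
        return "".join(self.digits)


entry = DigitEntry()
for spoken in ["1", "3", "0", "7"]:   # the second digit should have been "2"
    entry.hear(spoken)
entry.hear("BACK 3")                  # highlight the second digit
fixed = entry.hear("2")               # correct it in place
cleared = entry.hear("CANCEL")        # method (1) would instead start again
print(fixed, repr(cleared))
```

For a four-digit string, method (2) costs two extra utterances and requires the
pilot to count positions, whereas method (1) costs five but needs no counting,
which is consistent with the pilots' stated preference for short strings.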

6  BAC  1-11  TRIAL  PHASE  3 

The final phase of the BAC 1-11 trial was to be the most convincing use of
DVI. The number of button selections required for route changing through the
onboard FMS had always been a source of pilot criticism. DVI could be seen to
assist in this problem along with such devices as a joystick and a rollerball.

An example of the comparison between button selection and the use of DVI is
given in Fig 6. In this example, the pilot has been given a route instruction
by ATC to fly direct from the present aircraft position to a waypoint,
identified as COR. In this instance, COR is not part of the database of known
waypoints held in the FMS memory. The sequence of button selections is self
explanatory. A point worth noting is the number of button selections required
for the waypoint identifier alone; a case where multi-function keys can save
space but can also result in a large number of button selections being required.
For this application DVI was found to be twice as fast and to be less
frustrating for the pilot to use.

Whilst the voice command was being given, feedback was presented in three
forms. Along with the normal recognition feedback display mounted on the
coaming, the pilots received feedback from the FMS Control and Display Unit
(CDU) and the electronic map. The electronic map showed the route change
requested, known as the secondary route, in a different colour as the command
was being given. If the pilot described the latitude or longitude incorrectly,
the secondary route could indicate gross errors pictorially. This error would
not be as obvious if purely represented by a digit string on the FMS CDU. Once
the route change request had been completed, the pilot would then say "EXECUTE";
the secondary route then became the primary route followed by the aircraft. The
change of direction of the aircraft could be regarded as a fourth form of
feedback.

The combination of DVI with a joystick or rollerball was ideal for other
route change procedures. The insertion of a number of new waypoints, which were
held in the FMS memory, could be carried out by using two techniques. These are
indicated by two typical voice instructions:

(1) "WAYPOINT ALPHA BRAVO CHARLIE INSERT DELTA ECHO FOXTROT INSERT GOLF
HOTEL INDIA EXECUTE" or

(2) "WAYPOINT ALPHA BRAVO CHARLIE INSERT JOYSTICK".

In the first example, the pilot sequentially says the identifiers of the
new waypoints to be inserted. In the second example, DVI is used to indicate
where the insertion should take place and instruct a joystick entry procedure to
commence. These new waypoints are chosen by moving a cross cursor on the
electronic map over the displayed waypoints required, using the joystick. The
insertion is completed by pressing a button positioned on the joystick. An
alternative to the joystick was a rollerball mounted on the pilot's right
armrest. The combination of voice and tactile control was liked by all pilots
who flew it in the BAC 1-11.

Along  with  the  various  other  route  change  facilities  such  as  inserting 
holding  patterns  or  removing  part  of  the  pre-programmed  route,  the  pilots  could 
insert  cruise  levels,  headings,  speed  and  power  settings  by  DVI.  A  vocabulary 
size  of  235  words  was  required  for  electronic  display,  RMS  and  FMS  control. 

Word  recognition  error  rate  for  the  pilots  ranged  from  <  1%  to  6%  when  using  a 
structured  syntax  which  was  suggested  by  users  and  which  they  considered 
natural.  The  maximum  number  of  words  active  at  a  branch  in  this  syntax  was  47, 
the  ICAO  phonetic  alphabet  and  the  digits  accounting  for  36  of  these  words. 
Although  DVI  was  still  regarded  as  a  very  useful  interfacing  technique,  a  few 
problems  had  become  apparent. 

With the SR 128 as the ASR, recognition response times for some commands
had become excessive. The response time from the end of the DVI command is not
only a function of the algorithm and technology used, but also a function of the
number of words active at the syntax branches and the number of words in the
command. The single command "TIME" is matched against 27 other words in the
syntax and has a response time of 0.5 sec. The command "GO DIRECT DELTA TANGO
YANKEE" is matched against 74 other words in total and the response time is
2.6 sec as a consequence. (In both of these examples, 0.4 sec of 'silence',
required before the recognition process begins, has been included). The next
generation of ASR would need to have quicker response times, ideally 0.25 sec.
The current generation of recognisers, replacing the SR 128, has little problem
meeting this.

The other area causing concern was not connected with ASR performance.
Despite the words and syntax being chosen by the pilots, for larger vocabularies
there had become a tendency for them to forget the details of commands to be
given. This human deficiency was noticed mainly during flight demonstrations or
during periods of high workload. On one occasion a pilot asked the author over
the aircraft intercom, "What do I have to say to Go Direct to Whiskey India
Tango?" The author replied, "Why don't you say, Go Direct Whiskey India Tango?"
Even the pilot's question had not reminded him of what was required. In the
short term, this problem can be addressed to some extent by automatic prompting.
A longer term solution would be to use recognisers with more intelligence that
could interpret what the pilot meant. There is no problem for small
vocabularies which the pilot can remember. Also, very large vocabularies, which
the pilot is unlikely to exceed, will probably present few problems. The real
difficulty resides with intermediate size vocabularies such as that used in the
Phase 3 trials.

Although I have shown that using DVI in the cockpit is feasible, certainly
in the relative quiet of the civil cockpit, I have indicated where improvements
are required. Some of these improvements would be essential for DVI to be
viable in the military cockpit.

7  TORNADO  DVI  TRIAL  AT  RAE  BEDFORD 

At the time of writing this Memorandum, a trial based on board a trainer
version of a Tornado GR1 has commenced at RAE Bedford. This trial will take DVI
into a more testing environment than that encountered in the BAC 1-11. Factors
such  as  increased  cockpit  noise,  high  'g'  and  vibration,  breath  noise  along  with 
impulse  noises  produced  by  the  oxygen  mask  exhaust  valve,  must  be  considered. 
Added  to  this  list  of  system  difficulties,  must  be  those  associated  with  the 
missions  conducted  in  this  type  of  aircraft,  often  resulting  in  frequent  high 
stress  states  in  the  crew.  The  potential  benefits  offered  by  DVI  are  primarily 
system  control  when  the  pilot's  hands  and  eyes  are  busy  with  tasks  outside  and 
within  the  cockpit.  There  are  also  military  benefits  in  being  able  to  short 
circuit  long  keying  sequences  and  being  able  to  reduce  the  need  for  keyboard 
functions  to  occupy  prime  space  within  the  cockpit.  Although  DVI  seems  to  be  a 
necessity in this very high workload situation, it must work well to be
effective. The trials at RAE will attempt to show that this can be achieved with
current technology.

Trials so far have concentrated on assessing the recognition performance
under several flight conditions with the next generation commercial recognition
system manufactured by Marconi. This ASR is called Macrospeak. Macrospeak
offers many of the improvements that were required as a result of the BAC 1-11
trial. Some of these improvements are faster response time, larger vocabulary,
adaptive background noise compensation, automatic gain control on the input
signal and a more flexible syntax structure. Even with all these added
features, the recognition algorithm remains similar to that of the SR 128. As
a result, flight tests in the Tornado using no syntax have produced unacceptable
recognition performance for some crew members. This is true of only a few of
the commands used, not all.

As a consequence, the next trial will use a flightworthy advanced
recogniser called the ASR 1000. A history of the development of this ASR was
recently reported in the periodical Speech Technology. The recognition
performance of the ASR 1000 will be assessed during flight, commencing August
1990. It is expected that the new algorithm, based on Hidden Markov Modelling
(HMM), will result in acceptable performance for the vast majority of crew
members under a wide range of conditions. If this proves to be the case, the
prospects for the operational use of DVI in military aircraft will be promising.

One potential application for the military use of DVI has been identified
by both RAE and service pilots. This is the interface between the rear crew
member of the Tornado and the aircraft's main computer. The problems to be
overcome include the need for too many button pushes and the attendant keying
errors. The keying procedures also often take longer than the time available.
These are all issues that have been addressed by the BAC 1-11 programme,
permitting the Tornado trials to benefit extensively from the earlier activity.

SUMMARY OF RECOGNITION RESULTS

THRESHOLD T500

Conditions      Sample size     % Error rate

Laboratory      11,460           9
Flight           3,120          37

(Test vocabulary - Alphanumerics)

TABLE 1


COCKPIT SOUND PRESSURE LEVELS

RAE/BAC 1-11

Condition                       Level

On ground (equipment on)        72 dBA
During flight   130 K IAS       73 dBA
                210 K IAS       78 dBA
                300 K IAS       85 dBA

TABLE 2

SUMMARY OF RECOGNITION RESULTS

MSRS SR 128

Conditions      Microphone used         % Error rate

Laboratory      Shure SM 10 boom        0.6
                Airlite 62 boom         0.9
                Amplivox throat         0.7
                Oxygen mask V2          1.3

Flight          Shure SM 10 boom        0.6
(250 K IAS)     Airlite 62 boom         2.1

(Test vocabulary - Alphanumerics)

TABLE 3

SUMMARY OF RECOGNITION RESULTS

MSRS SR 128 - Displays Operation in Flight

                        Pilot A                   Pilot B
Microphone used         % error rate   sample     % error rate   sample

Airlite 62 boom         1.76           1251       1.96           1073
Oxygen mask             0.78            128       2.88            417
Throat mic.             3.77            265       3.31            242

TABLE 4
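
The Table 4 percentages can be turned back into approximate error counts and
combined into one figure per pilot. The following is a minimal sketch only: the
pooled rate is not quoted in the Memorandum, "sample" is assumed to mean the
number of utterances, and the names TABLE_4, implied_errors and
pooled_error_rate are invented for illustration.

```python
# Table 4 figures as read from the source: (percent error rate, sample size).
TABLE_4 = {
    "Pilot A": {
        "Airlite 62 boom": (1.76, 1251),
        "Oxygen mask": (0.78, 128),
        "Throat mic.": (3.77, 265),
    },
    "Pilot B": {
        "Airlite 62 boom": (1.96, 1073),
        "Oxygen mask": (2.88, 417),
        "Throat mic.": (3.31, 242),
    },
}


def implied_errors(rate_percent, sample):
    """Nearest whole number of misrecognitions implied by a rate and a sample."""
    return round(rate_percent * sample / 100)


def pooled_error_rate(rows):
    """Percent error rate over all microphones (assumption: simple pooling)."""
    errors = sum(implied_errors(rate, n) for rate, n in rows.values())
    total = sum(n for _, n in rows.values())
    return 100 * errors / total


for pilot, rows in TABLE_4.items():
    total = sum(n for _, n in rows.values())
    print(f"{pilot}: {pooled_error_rate(rows):.2f}% over {total} samples")
```

Pooling weights each microphone by its sample size, so the boom microphone
dominates the result for both pilots.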
Fig 5 Radio management system control unit


FMS KEYBOARD

 1- Select Route Change page.
 2- Select Insert Wpt. option.
 3- Select After A/C posn. ident.
 4- Key in 'COR' code - 4 pushes.
    (Keypad keys shown: 1 ABC, 5 MNO, 6 PQR)
 5- Select Insert button.
    Position of 'COR' required ??
 6- Select Latitude index.
 7- Key in N 40 37 8
 8- Select Insert button.
 9- Key in E 27 50 9
10- Select Insert button.
11- Select Execute button.
12- Select Route Change page.
13- Select Go Direct option.
14- Select From A/C posn. ident.
15- Select 'COR' = To Waypoint.
16- Select Execute button.

TOTAL No. - 37 button pushes.

D.V.I.

Press DVI activation switch.
Say 'Go Direct Charlie Oscar Romeo Enter.'
Say North 40 37 decimal 8
Say East 27 50 decimal 9 Enter
Say Execute

Fig 6 Actions required - 'COR' not in FMS database
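
The spoken sequence in Fig 6 implies a small fixed syntax: a command phrase, a
waypoint ident spelled phonetically, then position data. The sketch below shows
how already-recognised word strings of that form might be converted into the
data the FMS needs. It is illustrative only and is not the syntax of any
recogniser described in this Memorandum. The function names, the spelling word
list (common English spellings, not the official ICAO ones) and the reading of
the digits as degrees and decimal minutes are all assumptions.

```python
# Hypothetical sketch: convert recognised word strings from Fig 6 into FMS data.
# The word list, function names and syntax rules are assumptions for illustration.

PHONETIC = {
    word: word[0].upper()
    for word in (
        "alpha bravo charlie delta echo foxtrot golf hotel india juliet kilo "
        "lima mike november oscar papa quebec romeo sierra tango uniform "
        "victor whiskey xray yankee zulu"
    ).split()
}

HEMISPHERES = {"north": "N", "south": "S", "east": "E", "west": "W"}


def parse_go_direct(utterance):
    """Return the waypoint ident spelled out in a 'Go Direct ... Enter' command."""
    words = utterance.lower().split()
    if len(words) < 4 or words[:2] != ["go", "direct"] or words[-1] != "enter":
        raise ValueError(f"not a Go Direct command: {utterance!r}")
    letters = words[2:-1]
    unknown = [w for w in letters if w not in PHONETIC]
    if unknown:
        raise ValueError(f"unrecognised spelling words: {unknown}")
    return "".join(PHONETIC[w] for w in letters)


def parse_position(utterance):
    """Parse e.g. 'North 40 37 decimal 8' into (hemisphere, degrees, minutes).

    Reading the digits as degrees and decimal minutes is an assumption.
    """
    words = utterance.lower().split()
    if words and words[-1] == "enter":  # a trailing 'Enter' is treated as optional
        words = words[:-1]
    if len(words) != 5 or words[0] not in HEMISPHERES or words[3] != "decimal":
        raise ValueError(f"not a position entry: {utterance!r}")
    degrees = int(words[1])
    minutes = float(f"{words[2]}.{words[4]}")
    return HEMISPHERES[words[0]], degrees, minutes


print(parse_go_direct("Go Direct Charlie Oscar Romeo Enter"))
print(parse_position("North 40 37 decimal 8"))
print(parse_position("East 27 50 decimal 9 Enter"))
```

Rejecting any utterance that does not fit the expected pattern mirrors the
role of syntax in constraining recognition, at the cost of requiring the crew
to keep to the fixed phrasing.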